42 research outputs found

    PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

    Machine Learning models are often composed of pipelines of transformations. While this design allows efficient execution of single model components at training time, prediction serving has different requirements, such as low latency, high throughput, and graceful performance degradation under heavy load. Current prediction serving systems treat models as black boxes, whereby prediction-time-specific optimizations are ignored in favor of ease of deployment. In this paper, we present PRETZEL, a prediction serving system introducing a novel white box architecture enabling both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL improves performance along several dimensions: compared to state-of-the-art approaches, PRETZEL on average reduces 99th percentile latency by 5.5x and memory footprint by 25x, and increases throughput by 4.7x. Comment: 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 201

    Development of micro-tubular perovskite cathode catalyst with bi-functionality on ORR/OER for metal-air battery applications

    As rechargeable metal-air batteries will be ideal energy storage devices in the future, they require an active cathode electrocatalyst that is bi-functional for both the oxygen reduction reaction (ORR) during discharge and the oxygen evolution reaction (OER) during charge. Here, a class of perovskite cathode catalyst with a micro-tubular structure has been developed, with bi-functionality controlled through different Ru and Ni dopant ratios. The micro-tubular structure is achieved by the activated carbon fiber (ACF) templating method, which provides uniform size and shape. Starting from the perovskite formula LaCrO3, the dual-dopant system is successfully synthesized, with the dopants fully incorporated into a single perovskite structure. The chemical oxidation states of Ni and Ru also confirm partial substitution at the B-site of Cr without any change to the main perovskite structure. In electrochemical measurements, the micro-tubular catalyst shows much more efficient catalytic activity for ORR and OER than the grain catalyst with the same perovskite composition. With the Ru and Ni ratio tuned, the LaCr0.8Ru0.1Ni0.1O3 micro-tubular catalyst exhibits strong bi-functionality, especially for ORR, at low metal loading, comparable to commercial Pt and Ir catalysts. The enhanced catalytic properties of the micro-tubular structure and the Ru/Ni synergy in the perovskite material may provide a new direction for next-generation cathode catalysts in metal-air battery systems.

    Towards accelerating generic machine learning prediction pipelines

    Machine Learning models are often composed of sequences of transformations. While this design makes it easy to decompose and accelerate single model components at training time, prediction requires low latency and highly predictable performance, so end-to-end runtime optimizations and acceleration are needed to meet these goals. This paper sheds some light on the problem by using a production-like model and showing how redesigning model pipelines for efficient execution on CPUs and FPGAs can yield performance improvements of several fold.

    Fully Distributed Multicast Routing Protocol for IEEE 802.15.8 Peer-Aware Communication

    IEEE 802.15.8 provides a peer-aware communication (PAC) protocol for peer-to-peer infrastructureless services with fully distributed coordination. One of the most promising services in IEEE 802.15.8 is group multicast communication with simultaneous membership in multiple groups, typically up to 10 groups, in a dense network topology. Most existing multicast techniques for mobile ad hoc networks (MANETs) incur significant overhead in managing the multicast group and thus cannot be used in fully distributed PAC networks. In this paper, we propose a lightweight multicast routing protocol referred to as the fully distributed multicast routing protocol (FDMRP). FDMRP minimizes routing table entries and thus reduces the control message overhead of multicast group management. To balance the control message load, all nodes in the network maintain a similar number of routing entries for managing nodes in the same multicast group. To measure the effectiveness of the proposed FDMRP against existing schemes, we evaluated its performance using the OPNET simulator. The evaluation shows that FDMRP can reduce the number of routing entries and the control message overhead by up to 85% and 95%, respectively, when the number of nodes exceeds 500.

    FPGA Implementation of Efficient CFAR Algorithm for Radar Systems

    The constant false-alarm rate (CFAR) algorithm is essential for detecting targets during radar signal processing. It has been improved to detect targets accurately, especially in nonhomogeneous environments such as multitarget or clutter-edge environments; examples include sort-based and variable index-based algorithms. However, these algorithms require large amounts of computation, making them difficult to apply in radar applications that require real-time target detection. We propose a new CFAR algorithm that determines the environment of a received signal through a new decision criterion and applies the optimal CFAR algorithm, such as the modified variable index (MVI) or automatic censored cell averaging-based ordered data variability (ACCA-ODV). Monte Carlo simulation results for the proposed CFAR algorithm showed a high detection probability of 93.8% in homogeneous and nonhomogeneous environments at an SNR of 25 dB. In addition, this paper presents the hardware design, field-programmable gate array (FPGA)-based implementation, and verification results for the practical application of the proposed algorithm. We reduced the hardware complexity by time-sharing sum and square operations and by replacing division operations with multiplication operations when calculating decision parameters. We also developed a low-complexity, high-speed sorter architecture that sorts the partial data in the leading and lagging windows. As a result, the implementation used 8260 LUTs and 3823 registers and took 0.6 μs to operate. Comparison with previously proposed FPGA implementations confirms that the complexity and operation speed of the proposed CFAR processor are well suited to real-time implementation.

    Improved interface control for high-performance graphene-based organic solar cells

    The demand for high-efficiency flexible optoelectronic devices is ever-increasing because next-generation electronic devices that comprise portable or wearable electronic systems are set to play an important role. Graphene has received extensive attention as a promising candidate material for transparent flexible electrode platforms owing to its outstanding electrical, optical, and physical properties. Despite these properties, the inert and hydrophobic nature of graphene surfaces makes it difficult to use in optoelectronic devices. In particular, commonly used charge transporting layer (CTL) materials for organic solar cells (OSCs) cannot uniformly coat a graphene surface, which causes such devices to fail. This paper proposes an approach that enables CTL materials to completely cover a graphene electrode with the assistance of commonly accessible polar solvents. The approach is successfully applied to various OSC configurations, with power conversion efficiencies of 8.17% for graphene electrode-based c-OSCs (OSCs with conventional structures), 8.38% for i-OSCs (OSCs with inverted structures), and 7.53% for flexible solar cells. The proposed approach is expected to bring significant advances in the efficiency of graphene-based optoelectronic devices and to open up new possibilities for flexible optoelectronic systems.

    Improved Interface Control for High-Performance Graphene-Based Solar Cells

    The demand for high-efficiency flexible optoelectronic devices is ever-increasing because next-generation electronic devices that comprise portable or wearable electronic systems are set to play an important role. Graphene has received extensive attention as a promising candidate material for transparent flexible electrode platforms owing to its outstanding electrical, optical, and physical properties. Despite these properties, the inert and hydrophobic nature of graphene surfaces makes it difficult to use in optoelectronic devices. In particular, commonly used charge transporting layer (CTL) materials for organic solar cells (OSCs) cannot uniformly coat a graphene surface, which inevitably leads to device failure. This paper proposes an approach that enables CTL materials to completely cover a graphene electrode with the assistance of commonly accessible polar solvents.

    The effect of the graphene integration process on the performance of graphene-based Schottky junction solar cells

    With the rise of graphene, its applications as an active component in various types of solar cells, such as transparent conductors, additives, or interfacial charge transport layers, have been intensively investigated. Among them, graphene-based Schottky junction solar cells have developed rapidly owing to their relatively simple device structures compared to conventional p-n junction solar cells. Through various modifications such as chemical doping, antireflection coating, and interfacial oxide layer control, a power conversion efficiency of over 15% has been reported. However, graphene-based Schottky junction solar cells often suffer from s-shaped current density-voltage characteristics, which lead to inevitable performance degradation, particularly in the fill factor. In this work, we investigate the origin of this behavior and propose a facile approach to suppress the s-shape character in the operation of graphene-based Schottky junction solar cells. Through careful modulation of the graphene integration process, interfacial charge recombination appeared to be significantly suppressed, leading to notably improved device performance (from 0.8% to 12.5%). Our findings provide valuable insights into the operating principle of graphene-based Schottky junction solar cells, which can play an important role as one of the primary suppliers of next-generation renewable clean energy.